1 Introduction

Understanding what drives worker productivity is a key concern in labor-intensive industries such as garment manufacturing. This project focuses on identifying and quantifying the main factors that influence productivity in a real-world production setting. We use a dataset collected from a garment factory in Bangladesh, which records daily operational data for several teams. The data include variables related to production planning, workforce composition and task characteristics.

The primary goal is to build a statistical model that not only predicts productivity, but also explains how different variables affect output. Given the structure of the data, where multiple teams are observed over time, we adopt a hierarchical Bayesian approach to account for group-level variability, with a particular focus on the effect of team membership.

This report begins with an exploratory data analysis (EDA), which helps reveal trends, outliers and potential issues in the dataset. The insights gained from the EDA will guide the modeling process and ensure that the assumptions of the statistical framework are well grounded.

1.1 Dataset Overview

The dataset comes from the UCI Machine Learning Repository and includes 1,197 observations from a garment factory in Bangladesh. Each row corresponds to a single working day for a specific team.

The data provide a mix of numerical and categorical variables, capturing key aspects of the production environment.

Because teams are followed over time, the dataset has a natural hierarchical structure. This makes it especially well-suited for multilevel modeling techniques.

Below is a summary of the variables included in the dataset:

Variable Name | Type | Description
date | Date | Day of the observation (format: mm/dd/yyyy)
day | Categorical | Day of the week
team | Categorical | Team number (1 to 12)
targeted_productivity | Numeric | Standard productivity set by management for the task
smv | Numeric | Standard Minute Value: estimated minutes required for the task
wip | Numeric | Work in Progress: unfinished items carried over from previous days
over_time | Numeric | Minutes of overtime worked on that day
incentive | Numeric | Monetary incentive offered to the team
idle_time | Numeric | Total idle time (minutes) during the day
idle_men | Numeric | Number of idle workers during the day
no_of_style_change | Numeric | Number of times the garment style changed during the day
no_of_workers | Numeric | Number of workers in the team
actual_productivity | Numeric | Actual productivity achieved on that day
quarter | Categorical | Ordinal week of the month (1 to 5) in which the observation was recorded

1.2 Objectives

The primary objective of this project is twofold. First, it aims to provide a thorough descriptive and exploratory analysis of the dataset by employing both visual tools and statistical summaries. This preliminary phase is essential for uncovering potential patterns in the data, assessing the presence of outliers or anomalies and evaluating the structure and distribution of key variables. The insights gained in this phase also serve to guide the subsequent modeling choices and ensure that the data are adequately preprocessed and structured for inference.

Second, the project seeks to develop a hierarchical Bayesian model for predicting actual worker productivity. The model is designed not only to quantify the relationship between productivity and a range of relevant predictors, including both continuous and categorical variables, but also to incorporate a group-level random effect for the variable team. By doing so, the model accounts for unobserved heterogeneity across teams and allows for team-specific deviations from the population-level effects. This hierarchical structure is particularly well-suited for organizational data, where individual observations are naturally nested within larger units, and supports more accurate estimation and interpretation of both individual- and group-level influences on productivity.
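
Schematically, a varying-intercept model of the kind described above can be written as follows (a structural sketch only; the concrete specification is developed in the modeling section):

\[ y_{it} \sim \mathcal{N}\!\left(\alpha_{j[i]} + \mathbf{x}_{it}^{\top}\boldsymbol{\beta},\; \sigma^2\right), \qquad \alpha_j \sim \mathcal{N}(\mu_\alpha, \sigma_\alpha^2), \quad j = 1, \dots, 12, \]

where \(j[i]\) denotes the team to which observation \(i\) belongs, so each team receives its own intercept \(\alpha_j\) drawn from a common population distribution, while the coefficients \(\boldsymbol{\beta}\) are shared across teams.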

2 Exploratory Data Analysis

We begin our exploratory data analysis by reviewing the structure of the dataset, presenting summary statistics for both numerical and categorical variables and identifying any inconsistencies that require cleaning prior to further analysis.

First 6 Rows of the Raw Dataset
date quarter day team targeted_productivity smv wip over_time incentive idle_time idle_men no_of_style_change no_of_workers actual_productivity
1 1/1/2015 Quarter1 Thursday 8 0.80 26.16 1108 7080 98 0 0 0 59.0 0.9407254
3 1/1/2015 Quarter1 Thursday 11 0.80 11.41 968 3660 50 0 0 0 30.5 0.8005705
4 1/1/2015 Quarter1 Thursday 12 0.80 11.41 968 3660 50 0 0 0 30.5 0.8005705
5 1/1/2015 Quarter1 Thursday 6 0.80 25.90 1170 1920 50 0 0 0 56.0 0.8003819
6 1/1/2015 Quarter1 Thursday 7 0.80 25.90 984 6720 38 0 0 0 56.0 0.8001250
8 1/1/2015 Quarter1 Thursday 3 0.75 28.08 795 6900 45 0 0 0 57.5 0.7536835

We compute descriptive statistics for all numerical variables and summarize key characteristics of the categorical variables, including the presence of missing values. This step helps detect invalid ranges, unusual values and potential data quality issues.

Summary of Numerical Variables
Variable | Mean | SD | Min | Q25 | Median | Q75 | Max | Range | NA Count
targeted_productivity | 0.7296 | 0.0979 | 0.07 | 0.70 | 0.75 | 0.80 | 0.80 | 0.73 | 0
smv | 15.0622 | 10.9432 | 2.90 | 3.94 | 15.26 | 24.26 | 54.56 | 51.66 | 0
wip | 1190.47 | 1837.46 | 7 | 774.5 | 1039 | 1252.5 | 23122 | 23115 | 0
over_time | 4567.46 | 3348.82 | 0 | 1440 | 3960 | 6960 | 25920 | 25920 | 0
incentive | 38.21 | 160.18 | 0 | 0 | 0 | 50 | 3600 | 3600 | 0
idle_time | 0.73 | 12.71 | 0 | 0 | 0 | 0 | 300 | 300 | 0
idle_men | 0.37 | 3.27 | 0 | 0 | 0 | 0 | 45 | 45 | 0
no_of_style_change | 0.15 | 0.43 | 0 | 0 | 0 | 0 | 2 | 2 | 0
no_of_workers | 34.61 | 22.20 | 2 | 9 | 34 | 57 | 89 | 87 | 0
actual_productivity | 0.7351 | 0.1745 | 0.2337 | 0.6503 | 0.7733 | 0.8503 | 1.1204 | 0.8867 | 0
Summary of Categorical Variables
Variable | Unique Values | NA Count
day | Monday, Saturday, Sunday, Thursday, Tuesday, Wednesday | 0
team | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 | 0
quarter | Quarter1, Quarter2, Quarter3, Quarter4, Quarter5 | 0
date | From 2015-01-01 to 2015-03-11 | 0

Based on the summaries above, we identify one inconsistency that needs to be addressed: actual_productivity represents the proportion of achieved productivity relative to the target, yet some values exceed 1. Since a proportion above 1 is not meaningful, we cap all values greater than 1 at 1.
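
This cleaning step can be sketched in pandas as follows (with a few illustrative values rather than the real data):

```python
import pandas as pd

# Illustrative values: the third one exceeds 1 and should be capped.
df = pd.DataFrame({"actual_productivity": [0.94, 0.80, 1.12, 0.65]})

# Cap every proportion above 1 at exactly 1; values at or below 1 are untouched.
df["actual_productivity"] = df["actual_productivity"].clip(upper=1.0)
```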

The table below shows the cleaned dataset after applying the necessary corrections.

Preview of the Cleaned Dataset
date quarter department day team targeted_productivity smv wip over_time incentive idle_time idle_men no_of_style_change no_of_workers actual_productivity
1 1/1/2015 Quarter1 sweing Thursday 8 0.8 26.16 1108 7080 98 0 0 0 59.0 0.9407254
3 1/1/2015 Quarter1 sweing Thursday 11 0.8 11.41 968 3660 50 0 0 0 30.5 0.8005705
4 1/1/2015 Quarter1 sweing Thursday 12 0.8 11.41 968 3660 50 0 0 0 30.5 0.8005705
5 1/1/2015 Quarter1 sweing Thursday 6 0.8 25.90 1170 1920 50 0 0 0 56.0 0.8003819
6 1/1/2015 Quarter1 sweing Thursday 7 0.8 25.90 984 6720 38 0 0 0 56.0 0.8001250

With these corrections applied, the dataset is now consistent and ready for further exploratory analysis.

2.1 Distribution Analysis: Histograms and Boxplots

2.1.1 Numerical Variables

We now explore the distribution of each variable. Histograms and density curves allow us to visually assess skewness and modality in the numerical features.

The histograms reveal several key insights:

  • Many variables such as incentive, over_time, wip, idle_men and idle_time are heavily right-skewed, indicating the presence of extreme values and long tails.
  • The variable actual_productivity displays a left-skewed distribution: most values are concentrated between 0.6 and 1.0, with a long tail toward lower values (consistent with its mean lying below its median).
  • Variables such as no_of_style_change and targeted_productivity exhibit discrete or concentrated distributions, with most observations clustered around a few values.

To further support our analysis, we now turn to boxplots, which provide a compact summary of the distribution while clearly highlighting outliers.

Outlier Summary and Statistics Without Outliers (statistics recomputed after removing outliers)
Variable | Outliers | Min | Max | Mean | Median | SD
actual_productivity | 54 | 0.35 | 1.00 | 0.76 | 0.80 | 0.15
targeted_productivity | 79 | 0.60 | 0.80 | 0.75 | 0.75 | 0.06
smv | 0 | 2.90 | 54.56 | 15.06 | 15.26 | 10.94
no_of_workers | 0 | 2.00 | 89.00 | 34.61 | 34.00 | 22.20
wip | 22 | 103.00 | 1871.00 | 1012.37 | 1039.00 | 341.51
incentive | 11 | 0.00 | 119.00 | 25.80 | 0.00 | 30.25
idle_time | 18 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
no_of_style_change | 147 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
over_time | 1 | 0.00 | 15120.00 | 4549.61 | 3960.00 | 3292.74
idle_men | 18 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00

The boxplots above highlight the presence of several outliers across numerical variables. This is further quantified in the table above, which reports for each variable the number of outliers, as well as basic statistics after removing them.

We observe that:

  • Some variables, such as wip, incentive and especially over_time, exhibit notable reductions in standard deviation after outlier removal.
  • For others like smv and no_of_workers, the distribution is stable and free of outliers.
  • However, in the cases of idle_time, idle_men and no_of_style_change, nearly all values are zero and the flagged outliers represent rare but meaningful events. Removing the outliers in these cases would discard the few informative values available, compromising the interpretation of these variables.

Thus:

  • We remove outliers from all numerical variables except the three mentioned above and standardize them via z-scoring. This makes coefficients in subsequent models comparable, stabilizes numerical optimization and posterior estimation, and keeps interpretation in terms of “standard deviation shifts”.
  • For idle_time, idle_men and no_of_style_change, we instead convert the variables into binary indicators to retain the information about whether a rare event occurred.
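
The preprocessing decisions above can be sketched as follows (pandas/NumPy on illustrative data; the 1.5 IQR rule used to flag outliers is an assumption for illustration, since the report does not state which rule was applied):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    # Right-skewed, like incentive or over_time in the real data.
    "incentive": rng.exponential(scale=40.0, size=200),
    # Mostly zero, with rare non-zero events, like idle_time.
    "idle_time": np.where(rng.random(200) < 0.05, 120.0, 0.0),
})

# 1. Drop outliers from the continuous predictor (1.5 * IQR rule, an assumption).
q1, q3 = df["incentive"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["incentive"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()

# 2. Standardize (z-score) so model coefficients read as "SD shifts".
df["incentive_z"] = (df["incentive"] - df["incentive"].mean()) / df["incentive"].std()

# 3. Convert the mostly-zero variable into a binary event indicator.
df["idle_time_flag"] = (df["idle_time"] > 0).astype(int)
```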

2.1.2 Categorical Variables

The date variable contains individual daily entries, which may result in cluttered and less interpretable plots, especially when visualizing time trends or group comparisons.

To improve readability and highlight higher-level temporal patterns, we aggregate by month. This allows:

  • easier identification of trends over time,
  • reduced noise from daily fluctuations,
  • clearer visual communication in summary plots.

This transformation is especially useful in exploratory data analysis and modeling, where over-detailed granularity may obscure rather than reveal insights.
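
For instance, the monthly grouping can be derived from the date column like this (assuming the mm/dd/yyyy format noted in the dataset overview):

```python
import pandas as pd

# A few illustrative dates in the dataset's mm/dd/yyyy format.
dates = pd.Series(["1/1/2015", "2/14/2015", "3/11/2015"])
months = pd.to_datetime(dates, format="%m/%d/%Y").dt.month_name()
```

Summary statistics and boxplots can then be grouped on `months` instead of on individual dates.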

The boxplots illustrate the distribution of several numerical variables across different days of the week. Overall, there are no strong day-specific patterns or notable shifts in central tendency for any of the variables. For instance, both actual_productivity and targeted_productivity exhibit relatively stable medians across all days, with symmetric distributions centered around zero, indicating consistent performance throughout the week. Similarly, variables such as incentive, over_time, smv and wip do not show clear variations by day, although some outliers are present, which is expected in standardized data. The variable no_of_workers appears particularly stable, with only minor fluctuations in its distribution across the week. In summary, the boxplots suggest that day of the week does not significantly affect the distribution of any of the key numerical variables in the dataset.

The boxplots display the distribution of standardized numerical variables across the five quarters in the dataset. Overall, most variables do not exhibit strong seasonal trends or significant shifts in central tendency over time. However, a few subtle patterns can be observed. For instance, actual_productivity appears slightly higher in Quarter 5, while over_time shows a lower median in Quarter 2. The variable incentive remains relatively stable across quarters with a consistently low median, although Quarter 5 displays a wider spread suggesting that higher incentive values were occasionally observed during this period, even if they were not the norm. Similarly, wip shows a slight upward shift in Quarters 4 and 5. The number of workers (no_of_workers) remains relatively stable across all quarters, although Quarter 3 seems to have a slightly higher median. Both smv and targeted_productivity maintain consistent distributions, with the latter appearing more concentrated in Quarters 1 and 2. In summary, all variables remain stable throughout the quarters, with only minor variations observed.

These boxplots display the distribution of standardized numerical variables across the months of January, February and March. While most variables remain relatively stable throughout the first quarter, a few interesting patterns begin to emerge. For instance, actual_productivity shows a slight increase in January, followed by a modest decrease in February and a reduction in variability by March. The variable over_time is noticeably higher and more dispersed in January, suggesting that the beginning of the year may involve longer or more irregular work hours, which then stabilize in the following months. The variable incentive shows a gradual decline from January to March, potentially reflecting adjustments in motivational strategies or seasonal changes in bonus structures. Meanwhile, wip and smv maintain fairly consistent distributions, with only minor fluctuations. The number of workers (no_of_workers) remains constant across months, highlighting workforce stability. Lastly, targeted_productivity appears stable in both median and spread; however, January exhibits a slightly higher median while maintaining similar variability—indicating potentially higher performance expectations at the start of the year. In summary, all variables remain relatively steady from month to month, with only subtle variations suggesting temporal operational dynamics.

The figure displays the distribution of standardized numerical variables across the 12 teams. Teams 1 to 4 exhibit higher-than-average actual_productivity, with medians above zero and limited variability, likely reflecting greater efficiency or experience. In contrast, teams 5 to 8 tend to underperform and show a concentration of negative outliers. Teams 9 to 12 recover slightly, though they remain near the overall average and still display negative outliers.

Team size, as captured by no_of_workers, progressively declines from teams 1 to 12, with smaller groups concentrated in teams 6 and 12. This structural difference may influence productivity, as smaller teams could face greater workload variability or reduced task-sharing capacity.

The distribution of incentive is relatively stable, with medians around zero across all teams. However, teams 1 to 3 show greater variability, suggesting a more dynamic or performance-based bonus structure. Teams 4 to 8 have narrower ranges and slightly lower medians, while teams 9 to 12 display a modest increase in both central tendency and spread. These patterns may reflect incentive policies aligned with team roles or performance expectations, but the relationship with actual productivity remains unclear.

over_time is fairly consistent across teams, centered around zero, but team 6 stands out with both a lower median and reduced variability. Team 12 also shows lower variability, though with notable outliers. This pattern aligns with the no_of_workers distribution, as both teams have the smallest workforce among all, suggesting that reduced staffing may correspond to more stable or limited overtime dynamics.

The smv variable shows clearer separation: teams 1 to 5 tend to handle tasks with higher standard minute values, while teams 6 to 12 are associated with lower SMV, except for Team 7, pointing to simpler or faster operations. Teams 5 and 7 show greater spread, suggesting mixed task complexity and possibly more unstable productivity levels.

targeted_productivity appears generally stable across teams, with slightly higher targets assigned to teams 1 to 3. Teams 5 to 8 show the lowest median values, indicating more modest expectations, while team 12 stands out with a highly concentrated distribution and the highest overall target, suggesting a consistently ambitious goal-setting strategy for that group.

wip levels vary considerably across teams. Teams 1 to 3 exhibit higher medians and greater dispersion, particularly team 2, suggesting more frequent or variable accumulation of unfinished work. From team 4 onward, distributions become more compact and centered, indicating more consistent workflow management. However, teams 9 and 10 display slightly elevated medians again, possibly reflecting mild inefficiencies or task delays in those groups.

These two sets of boxplots compare the distributions of standardized numerical variables based on the idle_time_flag and the idle_men_flag. At a glance, the patterns they reveal are nearly identical, suggesting that both flags capture very similar aspects of the production process. In both cases, the presence of idle conditions (when the flag equals 1) is associated with a clear drop in actual_productivity, indicating that downtime—regardless of how it is defined—is consistently linked to reduced performance. Additionally, incentive, targeted_productivity, and wip tend to decrease under idle conditions, possibly reflecting slower operations or reduced worker engagement. Interestingly, both smv and over_time show a slight increase during idle periods. This could suggest that when downtime occurs, tasks become more fragmented or less efficient, requiring more time to complete and potentially involving more complex activities that disrupt the standard workflow. Given the high degree of similarity between the two flag-based comparisons, it would be reasonable to retain only one of them for further analysis. We recommend keeping the idle_time_flag, as it appears more interpretable and directly tied to temporal inefficiencies, making it a more informative and actionable indicator in the context of productivity modeling.

This set of boxplots illustrates the impact of the style_change_flag on the distribution of standardized numerical variables. The flag indicates whether a change in garment style occurred (1) or not (0), and its effect is clearly visible across multiple metrics. When a style change is present, both actual_productivity and targeted_productivity tend to decrease, suggesting that switching styles disrupts workflow and lowers both performance and expectations. This is accompanied by reduced values in incentive and wip, potentially reflecting a slowdown in production or temporary inefficiencies during the transition phase. Conversely, both over_time and smv increase when a style change occurs, which may indicate the need for process adjustments, machine reconfiguration, or increased task complexity. Additionally, no_of_workers becomes more concentrated at higher values, suggesting that greater labor input is required to manage the transition. Overall, the plots highlight that style changes are associated with a clear decline in productivity-related indicators and an increase in operational demands. These findings reinforce the idea that such transitions introduce inefficiencies and should be carefully managed in production planning to minimize disruption.

2.1.3 Preliminary Insight

This preliminary exploratory analysis provides initial insights into which variables might be relevant for predicting actual_productivity. Although not conclusive, the patterns observed so far allow us to distinguish between potentially informative and less impactful predictors—both numerical and categorical. These early observations serve as a foundation for guiding future feature selection and model refinement.

2.1.3.1 Promising Predictors

Categorical Variables:

  • team: Clear and consistent differences in actual_productivity across teams suggest that team is a highly informative variable. Given the organizational structure, it is also a strong candidate for modeling as a hierarchical (random) effect, capturing group-level variability in a mixed-effects or Bayesian framework.
  • style_change_flag: Productivity significantly decreases when a style change occurs, indicating operational disruption. This variable captures transition effects that can directly impact workflow efficiency.
  • idle_time_flag: Associated with a visible drop in productivity, this flag reflects temporal inefficiencies. It is both interpretable and impactful, making it more useful than the redundant idle_men_flag.

Numerical Variables:

  • incentive and targeted_productivity: Lower incentive and targeted_productivity levels are often associated with lower productivity, possibly reflecting decreased motivation or fewer output-based rewards.
  • over_time and smv: Productivity tends to be higher for tasks with lower over_time and smv, implying that task complexity or duration may inversely relate to output.
  • wip: Although not a strong standalone predictor, wip shows partial alignment with productivity drops during idle periods, style transitions, and across different teams. However, the relationship is not clearly defined. Its predictive value may emerge more clearly when combined with contextual variables, suggesting that wip could serve as a useful interaction term in more complex modeling frameworks.

2.1.3.2 Less Informative Predictors

Categorical Variables:

  • day, month, quarter: Time-related groupings do not show clear or systematic differences in actual_productivity, suggesting that temporal calendar effects have limited impact in this setting.
  • idle_men_flag: This flag replicates the behavior of the idle_time_flag almost identically, provides no additional information and can likely be excluded.

Numerical Variables:

  • no_of_workers: This variable is relatively stable across all groupings and does not show a clear relationship with productivity. It may still be useful in capacity-related analyses but appears less predictive in this context.

2.1.4 Normal Quantile Transformation of the Target Variable

After removing outliers and applying z-scoring, the distribution of the target variable actual_productivity still exhibits evident asymmetry and departure from normality, as illustrated in the left panel below. These characteristics may adversely impact the robustness, convergence and interpretability of subsequent models, particularly those whose inference relies on distributional assumptions such as approximate normality of the response.

To mitigate these issues, we apply a Normal Quantile Transformation (NQT), also known as a rank-based inverse normal transformation, to the target variable. This transformation:

  • Preserves the ordinal structure of the data (i.e., the relative ranking of observations),
  • Maps each empirical quantile to the corresponding quantile of a standard normal distribution \(\mathcal{N}(0, 1)\),
  • Reduces skewness and stabilizes variance,
  • Makes the distribution more compatible with linear and Bayesian modeling frameworks.

2.1.4.1 Mathematical Definition

Let \(x_1, x_2, \dots, x_n\) be the observed values of the target variable. The transformation proceeds as follows:

  1. Ranking the observations:
    Each value \(x_i\) is assigned a rank \(R_i \in \{1, \dots, n\}\), based on its position in the sorted sequence of all observations. In the presence of ties, average ranks are used.

  2. Estimating cumulative probabilities:
    The empirical cumulative probability associated with each observation is computed using the formula: \[ p_i = \frac{R_i - 0.5}{n} \] This yields strictly non-extreme values in the interval \((0, 1)\), avoiding the undefined boundaries of the normal quantile function at 0 and 1. This value approximates the probability that a randomly drawn observation from the empirical distribution is less than or equal to \(x_i\).

  3. Applying the normal quantile function:
    Each cumulative probability \(p_i\) is transformed into a value \(z_i\) from the standard normal distribution via: \[ z_i = \Phi^{-1}(p_i) \] where \(\Phi^{-1}\) denotes the inverse cumulative distribution function (also called the quantile function or probit) of the standard normal distribution.

As a result, the transformed variable \(z_i\) is approximately distributed as \(\mathcal{N}(0, 1)\), while preserving the original rank ordering of the data: if \(x_i < x_j\), then \(z_i < z_j\) after transformation. This means the order of quantiles is maintained, allowing for monotonic transformations without distorting the relative structure of the data. This rank-based normalization is especially useful when the original variable exhibits non-Gaussian features such as skewness or heavy tails and it enhances the suitability of the target for use in modeling frameworks that benefit from regularized and symmetric response variables.
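
The three steps translate directly into a few lines of NumPy/SciPy (a sketch; `rankdata` with `method="average"` implements the tie handling described in step 1):

```python
import numpy as np
from scipy.stats import rankdata, norm

def normal_quantile_transform(x):
    """Rank-based inverse normal transform: maps x to approximately N(0, 1)."""
    x = np.asarray(x, dtype=float)
    r = rankdata(x, method="average")   # step 1: ranks, ties get average ranks
    p = (r - 0.5) / x.size              # step 2: probabilities strictly in (0, 1)
    return norm.ppf(p)                  # step 3: standard normal quantiles

rng = np.random.default_rng(1)
skewed = rng.exponential(size=1000)     # heavily right-skewed input
z = normal_quantile_transform(skewed)   # rank order preserved, roughly N(0, 1)
```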

2.2 Linear Relationships

This section explores the strength and structure of linear associations between the predictors and the transformed target variable actual_productivity_nqt. We begin with a visual analysis through scatterplots, followed by a correlation matrix and conclude by assessing multicollinearity through Variance Inflation Factors (VIF).

2.2.1 Scatterplots of Predictors

We start by exploring pairwise linear associations between actual_productivity_nqt and the numerical predictors using scatterplots. This helps assess potential linear trends, dispersion patterns and deviations from model assumptions.

2.2.1.2 Team-Level Variation Across Predictors

Beyond global trends, the scatterplots highlight meaningful differences in how individual teams distribute themselves across predictor values, revealing operational diversity and contextual nuances.

  • incentive: All teams exhibit variation in incentive policies, with Team 11 (pink) particularly standing out for its broad spread. Teams 5 to 9 show a more clustered presence at lower incentive values, yet still align with the global positive trend. This underscores the universal relevance of incentive as a productivity driver, regardless of team.

  • no_of_workers: Teams such as 1, 2 and 3 display moderate variability in workforce size, whereas others like Teams 7 and 12 operate consistently within a narrow band. However, across all teams, productivity levels vary substantially even at fixed workforce sizes, confirming that no_of_workers lacks explanatory power both globally and within teams.

  • over_time: All teams span a wide range of overtime hours, particularly Teams 1 to 3, which include several extreme high outliers. Despite this, no team deviates from the overall absence of a trend, indicating that over_time does not meaningfully differentiate team behavior in terms of productivity.

  • smv: Teams encounter varying task complexities, with Teams 11 and 12 spanning a wider range. The generally negative trend remains intact across teams, suggesting that the complexity–productivity trade-off is mildly observed regardless of team structure.

  • targeted_productivity: Teams differ significantly in their operational targets. Teams 2 and 5 are concentrated in lower bands, while Teams 1, 3 and 4 operate with higher targets and correspondingly higher actual productivity. This tight alignment between target and outcome across most teams suggests effective performance alignment and supports the use of targeted_productivity as a structured and meaningful predictor.

  • wip: Work-in-progress values vary across teams, with Teams 1 to 3 showing the highest levels. These teams also exhibit higher productivity, reinforcing the moderate positive trend observed globally. While no team deviates drastically, the spread in WIP values reflects underlying heterogeneity in operational tempo or workflow organization.

2.2.1.3 Summary

The joint analysis of global patterns and team-level behaviors confirms the critical role of incentive and targeted_productivity as robust linear predictors. wip provides additional signal, particularly when accounting for team context. In contrast, smv exhibits only weak linearity but may retain importance under more flexible modeling strategies. Finally, both over_time and no_of_workers lack sufficient variability and association with the target to justify inclusion in predictive models based on these scatterplots alone.

2.2.2 Correlation Matrix

To further evaluate the linear relationships between numerical variables and detect potential collinearity issues, we now compute and visualize the Pearson correlation matrix.
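
With pandas this is a single call; the sketch below uses synthetic data constructed so that the response depends linearly on both predictors (the column names only mirror those in the report):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 300
incentive = rng.normal(size=n)
target = rng.normal(size=n)
# Response driven linearly by both predictors plus noise.
y = 0.8 * incentive + 0.5 * target + rng.normal(scale=0.4, size=n)

df = pd.DataFrame({
    "incentive": incentive,
    "targeted_productivity": target,
    "actual_productivity_nqt": y,
})
corr = df.corr(method="pearson")   # symmetric matrix of pairwise correlations
```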

As expected, targeted_productivity is strongly correlated with the actual productivity achieved (correlation = 0.66), confirming that planned targets and real outcomes tend to move together, though not perfectly. This relationship reinforces its importance as a key predictor. An even stronger correlation with the target is shown by incentive (0.83), suggesting that higher incentives are generally associated with higher productivity, possibly reflecting motivation-driven performance gains.

Interestingly, as observed in the previous step, wip shows a moderate positive correlation (0.25) with actual_productivity_nqt. This indicates that higher levels of work in progress may be associated with greater output, perhaps reflecting a more active or efficient workflow. While this is not a strong linear correlation, it is sufficient to justify keeping wip in consideration for modeling, especially in interaction with operational conditions.

On the other hand, all other variables exhibit very weak or negligible linear correlations with the target:

  • over_time shows a near-zero correlation (−0.02) with actual_productivity_nqt, suggesting that the amount of overtime worked does not translate into higher or lower productivity in a consistent linear way;
  • smv (−0.15) has a slightly negative correlation with the target, indicating that tasks with higher standard minute values, typically more complex or time-consuming, tend to be associated with lower productivity;
  • no_of_workers is almost entirely uncorrelated with the target (0.03), implying that variations in team size have minimal linear impact on individual productivity levels.

Looking beyond the target variable, some internal correlations among predictors are worth noting:

  • smv has a strong positive correlation with no_of_workers (0.60), indicating that tasks with higher standard times tend to involve more workers, possibly due to complexity or workload distribution;
  • over_time is moderately correlated with both smv (0.29) and no_of_workers (0.35), suggesting that time-consuming or more resource-intensive tasks lead to extended working hours.
  • Notably, incentive is not strongly correlated with any other predictor, which may make it valuable in models by adding independent information.

In summary, targeted_productivity and incentive emerge as the most relevant linear predictors for actual_productivity_nqt followed by wip, which shows a moderate but non-negligible association. Other variables appear less directly related to the target in a linear sense, but they may still contribute meaningfully through non-linear relationships or interactions with other variables. These insights help refine the set of candidate predictors for subsequent modeling phases and highlight the importance of exploring non-linear effects and interaction terms in future models.

2.2.3 Variance Inflation Factor (VIF)

To assess multicollinearity among the most relevant linear or potentially linear predictors, we computed the Variance Inflation Factor (VIF) for targeted_productivity, incentive and wip.

These predictors were selected based on the results of the correlation matrix and scatterplots. Both incentive and targeted_productivity exhibit strong and clearly linear associations with actual_productivity_nqt. While wip shows a noisier and more dispersed pattern, its moderate positive trend suggests that it may still behave linearly to some extent and is therefore included in this diagnostic.

All values are well below the commonly used threshold of 5, which suggests that multicollinearity is not a concern among these variables.

Specifically:

  • incentive has the highest VIF, slightly above 1.5, indicating only minimal redundancy with the other two predictors.
  • targeted_productivity shows a similarly low VIF value, reinforcing its independence in the presence of incentive and wip.
  • wip has the lowest VIF, confirming that it contributes unique information to the model and is not strongly collinear with the others.

These results validate the joint use of these three predictors in a linear model. No transformation or exclusion is required at this stage due to collinearity, allowing us to proceed confidently with model fitting and, in the next section, explore potential non-linear effects with more flexible models such as GAMMs.
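As a sketch, VIF values of this kind can be obtained from an auxiliary linear model via car::vif (again assuming the cleaned data frame `data`):

```r
# VIF diagnostic for the three candidate predictors.
library(car)
vif_fit <- lm(actual_productivity_nqt ~ targeted_productivity + incentive + wip,
              data = data)
vif(vif_fit)  # values below the usual threshold of 5 indicate low collinearity
```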

2.3 Non-linear Modeling with Generalized Additive Mixed Models (GAMM)

Building on the previous analysis of linear associations, we now explore non-linear effects using a Generalized Additive Mixed Model (GAMM). This approach allows for flexible modeling of complex relationships between predictors and the target variable actual_productivity_nqt, using smooth functions for continuous variables and accounting for team-level heterogeneity through random effects.

Unlike classical linear models, GAMMs do not assume linearity across the full range of predictor values. Instead, they estimate non-linear patterns directly from the data using basis functions that capture smooth transitions, making GAMMs particularly suitable in the presence of curved or threshold effects, as observed in earlier scatterplots and residual diagnostics.

2.3.1 Spline-Based Smooth Terms

In this model, we use cubic regression splines to model the non-linear relationships of continuous covariates. A cubic spline is a piecewise polynomial of degree three that is smooth and continuous at predefined points called knots. Between each pair of knots, the function behaves like a cubic polynomial, while at the knots themselves, continuity and smoothness constraints ensure a coherent global shape. This provides enough flexibility to model a wide range of non-linear patterns without overfitting.

Cubic splines are particularly effective in GAMMs because they allow the model to adapt locally to the shape of the data while avoiding the instability that might arise from high-degree global polynomials. The smoothness of each spline is automatically optimized using penalized likelihood, balancing model fit and complexity.
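As an illustration of this construction, a cubic regression spline basis can be generated with mgcv::smoothCon on toy data; the models below use the same mechanism for their smooth terms:

```r
# Build and visualize a cubic regression spline basis (illustrative toy data).
library(mgcv)
toy <- data.frame(x = seq(0, 10, length.out = 200))
sm <- smoothCon(s(x, bs = "cr", k = 5), data = toy, absorb.cons = TRUE)[[1]]
dim(sm$X)  # one row per observation, one column per basis function
matplot(toy$x, sm$X, type = "l", lty = 1,
        xlab = "x", ylab = "basis value", main = "Cubic regression spline basis")
```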

2.3.2 Model Specification

The model incorporates:

  • Linear terms: incentive and targeted_productivity, selected based on their previously confirmed linear association with the response.
  • Smooth spline terms: wip, over_time, smv and no_of_workers, modeled with cubic regression splines to capture potential non-linear effects.
  • Fixed effects: day, month, quarter, style_change_flag and idle_time_flag, included to control for structured seasonal and operational factors.
  • Random effect: team, modeled using s(team, bs = "re"), which introduces a random intercept for each team. This accounts for unobserved group-level heterogeneity, enabling the model to separate team-specific baselines from the global effects of predictors.

This flexible structure allows us to evaluate both the shape and the significance of each predictor’s contribution, laying the groundwork for model refinement and feature selection in subsequent steps.

# Prepare dataset (dplyr for the pipeline, mgcv for gam())
library(dplyr)
library(mgcv)

data_model <- data %>%
  mutate(across(c(quarter, day, month, team, style_change_flag, idle_time_flag), as.factor)) %>%
  select(actual_productivity_nqt, no_of_workers, incentive, over_time, wip, smv,
         targeted_productivity, quarter, day, month, style_change_flag, idle_time_flag, team)

# Fit the GAMM
model_gamm <- gam(actual_productivity_nqt ~  
                    incentive +
                    targeted_productivity + 
                    s(wip) + 
                    s(over_time) + 
                    s(smv) +
                    s(no_of_workers) +
                    s(team, bs = "re") +
                    day + month + quarter +
                    style_change_flag + 
                    idle_time_flag,
                  data = data_model, method = "REML")

2.3.3 Summary and Smooth Terms

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## actual_productivity_nqt ~ incentive + targeted_productivity + 
##     s(wip) + s(over_time) + s(smv) + s(no_of_workers) + s(team, 
##     bs = "re") + day + month + quarter + style_change_flag + 
##     idle_time_flag
## 
## Parametric coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -0.45778    0.06226  -7.352 6.78e-13 ***
## incentive              0.59494    0.02317  25.673  < 2e-16 ***
## targeted_productivity  0.23665    0.01855  12.760  < 2e-16 ***
## daySaturday           -0.03523    0.05187  -0.679   0.4972    
## daySunday             -0.04954    0.04880  -1.015   0.3105    
## dayThursday           -0.02951    0.05099  -0.579   0.5630    
## dayTuesday            -0.01522    0.04928  -0.309   0.7576    
## dayWednesday           0.05716    0.04949   1.155   0.2485    
## monthFebruary          0.08886    0.04589   1.936   0.0533 .  
## monthMarch             0.03179    0.05343   0.595   0.5520    
## quarterQuarter2       -0.01430    0.03867  -0.370   0.7116    
## quarterQuarter3       -0.01156    0.04685  -0.247   0.8051    
## quarterQuarter4       -0.05276    0.04851  -1.088   0.2772    
## quarterQuarter5        0.05373    0.08955   0.600   0.5487    
## style_change_flag1    -0.02818    0.05349  -0.527   0.5985    
## idle_time_flag1       -0.47711    0.11343  -4.206 3.01e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df      F p-value    
## s(wip)           2.301  2.940  2.062 0.09476 .  
## s(over_time)     1.000  1.001 14.847 0.00013 ***
## s(smv)           2.424  2.979 19.064 < 2e-16 ***
## s(no_of_workers) 3.202  3.899  3.456 0.00985 ** 
## s(team)          8.821 11.000  4.261 < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.819   Deviance explained = 82.9%
## -REML = 266.67  Scale est. = 0.11772   n = 606
Estimated smooth terms in the GAMM model with random intercept for team.

                     edf  Ref.df       F  p-value
s(wip)             2.301   2.940   2.062    0.095
s(over_time)       1.000   1.001  14.847    0.000
s(smv)             2.424   2.979  19.064    0.000
s(no_of_workers)   3.202   3.899   3.456    0.010
s(team)            8.821  11.000   4.261    0.000

The model yields strong performance with an adjusted R² of 0.819 and 82.9% deviance explained.

2.3.4 Main Findings

  • incentive and targeted_productivity are highly significant linear terms (p < 2e-16), confirming their central role in explaining productivity variations.

  • idle_time_flag is also highly significant (p = 3.01e-05), with a negative effect on productivity, aligning with expectations from earlier exploratory analysis.

  • s(over_time) and s(smv) are both strongly significant smooth terms, indicating clear non-linear patterns:

    • Higher smv values lead to substantial productivity loss;
    • over_time shows a diminishing return or penalizing effect.
  • s(no_of_workers) is statistically significant (p = 0.00985), suggesting a possible non-linear influence worth deeper inspection. However, the estimated effect appears unstable and poorly defined, with wide confidence intervals and no clear interpretability across most of the observed range. For this reason, its practical contribution to the model is questionable, and the term may be excluded in favor of parsimony.

  • s(wip) is only marginally significant (p = 0.095), despite showing some curvature (edf = 2.30). This suggests that wip may not have a strong standalone effect but could still interact with other variables like incentive or s(smv).

  • The random intercept for team is significant (p < 2e-16), confirming that a portion of the variability in productivity is attributable to team-level differences.

  • All other categorical variables (day, month, quarter and style_change_flag) are not statistically significant. This reinforces earlier findings that calendar-based effects are negligible once operational and team-level variables are included, and also highlights the negligibility of event-level effects.

Overall, the GAMM confirms the conclusions drawn from prior linear and exploratory analyses:

  • We should retain incentive, targeted_productivity, idle_time_flag, smv and over_time.
  • wip may be excluded as a standalone term but remains a candidate for interaction modeling.
  • team should be modeled hierarchically, as group-level effects are relevant.
  • Temporal and structural factors like day, month, quarter and style_change_flag do not contribute meaningfully and can be removed in simplified versions of the model.

2.3.5 QQ-Plot and Residuals vs Fitted

To evaluate the adequacy of our Generalized Additive Mixed Model (GAMM), we inspect two key diagnostic plots: the QQ-plot of residuals and the residuals vs fitted values plot. The QQ-plot assesses whether residuals follow a normal distribution, a key assumption for many inferential procedures such as confidence intervals and p-values. The residuals vs fitted plot is used to detect model misspecification, such as non-linearity not accounted for or non-constant variance.
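A minimal sketch of these two diagnostics, assuming the fitted object model_gamm from the previous section:

```r
# QQ-plot and residuals-vs-fitted diagnostics for the GAMM.
library(mgcv)
par(mfrow = c(1, 2))
qq.gam(model_gamm, rep = 100)  # QQ-plot of deviance residuals with reference band
plot(fitted(model_gamm), residuals(model_gamm),
     xlab = "Fitted values", ylab = "Residuals", main = "Residuals vs Fitted")
abline(h = 0, lty = 2)
```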

The QQ-plot reveals that residuals align reasonably well with the theoretical normal distribution in the center but show substantial deviations in the tails. This pattern indicates the presence of heavy tails, suggesting that a normal distribution may not adequately capture the behavior of extreme residuals.

By contrast, the residuals vs fitted plot displays a random scatter around zero, with no apparent funnel shape or structured pattern. This confirms that the GAMM adequately models the underlying functional relationships and variance structure, without strong evidence of heteroskedasticity or omitted nonlinearities.

Together, these diagnostics imply that while the GAMM is well specified, the assumption of normal residuals may be too restrictive. Thus, it would be reasonable to consider a t-distribution for residuals in a Bayesian framework. This would provide greater robustness to outliers by limiting their influence on parameter estimates.

It is important to note, however, that p-values reported by GAMMs are only approximate, relying on normality assumptions and asymptotic theory. While useful for preliminary variable selection, they should not be over-interpreted.

A more robust approach involves using these p-values and diagnostics to guide predictor selection and transformation, followed by Bayesian estimation with a t-distributed likelihood. This ensures principled inference through credible intervals and improved handling of outliers and heavy-tailed residuals. It is precisely the approach we will adopt in the Bayesian models introduced in the next section.

3 Bayesian Modeling

In this section, we develop a set of hierarchical Bayesian models to analyze and explain the factors influencing worker productivity in the garment sector. Given the complexity and nested nature of the data, where individual observations are clustered within production teams, we adopt a modeling framework that allows for both global effects and team-specific variability.

The Bayesian paradigm offers several advantages in this context: it enables the incorporation of prior knowledge, provides full posterior distributions for uncertainty quantification and handles hierarchical structures in a principled and coherent way. We begin with a relatively simple centered hierarchical model that already includes non-linear effects through spline bases, and then progressively increase model complexity by testing different interaction structures and modifying the hierarchical specification. We explore both centered and non-centered parameterizations for the random team effects, which allows us to assess how reparameterization impacts convergence, identifiability and model fit.

Through a combination of convergence diagnostics, posterior inference and model comparison criteria such as the Deviance Information Criterion (DIC) and posterior predictive checks, we systematically evaluate the performance of each model. This stepwise approach highlights how each component contributes to the model’s ability to capture the key drivers of productivity and ultimately guides us toward a flexible and interpretable specification that best fits the observed data.

3.1 Model 1: Centered Hierarchical Additive Model

This first Bayesian model serves as the baseline and was constructed by integrating the insights gained from both EDA and the results of the GAMMs. The model is designed to combine linear effects, smooth nonlinear effects and a hierarchical structure that captures variability across teams.

The response variable \(y_i\), which represents actual_productivity_nqt, is assumed to follow a Student-t distribution:

\[ y_i \sim \text{Student-t}(\mu_i, \tau^{-1}, \nu) \]

We use this distribution instead of a normal distribution because it is more robust to extreme values. The degrees of freedom \(\nu\), which control the heaviness of the tails, are estimated from the data using an exponential prior:

\[ \nu \sim \text{Exponential}(1) \]

This choice allows the model to flexibly adapt to the presence of outliers: smaller values of \(\nu\) imply heavier tails and greater robustness, while larger values approximate a normal distribution. The prior favors moderate to low values, letting the data inform the appropriate tail behavior.

The mean of the response \(y_i\) is modeled as the sum of several components as follows

\[ \mu_i = \beta_0 + \mathbf{X}_i^\top \boldsymbol{\beta}_X + \mathbf{Z}_{1,i}^\top \boldsymbol{\beta}_1 + \mathbf{Z}_{2,i}^\top \boldsymbol{\beta}_2 + (u_{j[i]} - \bar{u}), \]

where:

  • \(\beta_0\) is the overall intercept;
  • \(\mathbf{X}_i\) contains the linear predictors targeted_productivity, incentive and idle_time_flag;
  • \(\mathbf{Z}_{1,i}\) and \(\mathbf{Z}_{2,i}\) contain spline basis functions for over_time and smv. These variables showed nonlinear patterns in EDA, which we represent using cubic regression splines;
  • \(u_{j[i]}\) is a team-specific intercept that accounts for unobserved differences between teams. We subtract the mean \(\bar{u}\) to ensure that the group effects are centered and do not absorb the global intercept.

We chose orthogonal spline bases for the nonlinear components because they help reduce multicollinearity between the basis functions and improve the convergence of the Markov Chain Monte Carlo (MCMC) algorithm. Orthogonal bases ensure that each smooth term contributes uniquely to the model, making parameter estimation more stable and efficient.

All regression coefficients are assigned standard normal priors:

\[ \beta \sim \mathcal{N}(0, 1) \]

These priors are weakly informative: they provide regularization, which helps prevent overfitting, but they are not overly restrictive.

For the standard deviation of the residuals, we use:

\[ \sigma \sim \text{Uniform}(0, 100) \]

The team-level effects \(u_j\) are modeled with a normal distribution:

\[ u_j \sim \mathcal{N}(0, \tau_{\text{team}}^{-1}) \]

The prior for the team-level standard deviation \(\text{sd_team}\) is a half-Student-t distribution:

\[ \text{sd_team} \sim \text{Student-}t^+(0, 1, 1) \]

This prior is commonly used in hierarchical models because it allows for flexible group-level variation while providing some shrinkage, which improves stability especially when the number of groups is small.

In summary, this model provides a flexible and interpretable structure that balances robustness, generalization and computational stability. It captures the most relevant features of the data and serves as a solid starting point for more complex or specialized models.

3.1.1 Model Definition and Setup

We now provide the complete definition of the Bayesian model and the corresponding data structures used for inference. The model is specified using JAGS syntax and reflects the formulation previously discussed in the theoretical description. We also define the dataset structure passed to JAGS, ensuring that the design matrices for the fixed and smooth components are constructed using orthogonalized spline bases. This improves identifiability and MCMC efficiency by reducing collinearity between predictors. Initial values for the Markov chains are defined in a way that supports model convergence, with small random perturbations around the empirical mean for each parameter. In particular, the team-level effects are initialized to have mean zero to preserve identifiability of the overall intercept.

set.seed(123)
library(mgcv)    # smoothCon() builds the spline bases below
library(R2jags)  # jags() interface to JAGS
model_string <- "
model {
  for (i in 1:N) {
    mu[i] <- beta0 +
             inprod(betaX[], X[i,]) +
             inprod(beta1[], Z1[i,]) +
             inprod(beta2[], Z2[i,]) +
             (u[team[i]] - u_mean)
    
    y[i] ~ dt(mu[i], tau, nu)
    y_rep[i] ~ dt(mu[i], tau, nu)
  }

  sigma ~ dunif(0, 100)
  tau <- pow(sigma, -2)
  nu ~ dexp(1)

  beta0 ~ dnorm(0, 1)

  for (j in 1:KX) {
    betaX[j] ~ dnorm(0, 1)
  }
  for (j in 1:K1) {
    beta1[j] ~ dnorm(0, 1)
  }
  for (j in 1:K2) {
    beta2[j] ~ dnorm(0, 1)
  }

  for (j in 1:J_team) {
    u[j] ~ dnorm(0, tau_team)
  }
  sd_team ~ dt(0, 1, 1) T(0,)
  tau_team <- pow(sd_team, -2)

  u_mean <- mean(u[])
}
"
writeLines(model_string, con = "model.jags")
X <- model.matrix(~ 0 + targeted_productivity + incentive + idle_time_flag, data = data)
Z_overtime <- smoothCon(s(over_time, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X
Z_smv <- smoothCon(s(smv, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X

jags_data <- list(
  y = as.vector(data$actual_productivity_nqt),
  X = X,
  Z1 = Z_overtime,
  Z2 = Z_smv,
  team = as.numeric(as.factor(data$team)),
  N = nrow(data),
  J_team = length(unique(data$team)),
  KX = ncol(X),
  K1 = ncol(Z_overtime),
  K2 = ncol(Z_smv)
)
inits <- function() {
  beta0_init <- mean(jags_data$y)
  u_init <- rnorm(jags_data$J_team, 0, 0.1)
  u_centered <- u_init - mean(u_init)

  list(
    beta0 = beta0_init,
    betaX = rnorm(jags_data$KX, 0, 0.1),
    beta1 = rnorm(jags_data$K1, 0, 0.1),
    beta2 = rnorm(jags_data$K2, 0, 0.1),
    u = u_centered,
    sigma = runif(1, 0.1, 1),
    sd_team = runif(1, 0.1, 1),
    nu = runif(1, 2, 10)
  )
}

3.1.2 Parameter Selection and Model Sampling

We define the parameters to monitor during the MCMC sampling. Diagnostic parameters include all regression coefficients, the random effect standard deviation and the group-level effects. In addition, we sample replicated responses y_rep for posterior predictive checking and request the calculation of the Deviance Information Criterion (DIC) for model evaluation.

params_diag <- c("betaX", "beta0", "beta1", "beta2", "u", "sd_team")
params_ppc <- c("y_rep")

jags_model <- jags(
  data = jags_data,
  inits = inits,
  parameters.to.save = c(params_diag, params_ppc, "deviance"),
  model.file = "model.jags",
  n.chains = 3,
  n.iter = 10000,
  n.burnin = 2000,
  n.thin = 1,
  DIC = TRUE
)

samples_diag <- as.mcmc(jags_model)

To ensure reproducibility and speed up report rendering, we load the precomputed posterior samples and model fit object from disk.

samples_diag <- readRDS("samples_diag_simple.rds")
jags_model <- readRDS("jags_model_simple.rds")

3.1.3 Posterior Predictive Check (PPC)

We now evaluate how well the model replicates the observed data by comparing the distribution of observed outcomes to the distribution of simulated responses drawn from the posterior predictive distribution.
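A density-overlay version of this check can be sketched as follows, assuming the y_rep draws stored by R2jags in the fitted object:

```r
# Overlay the observed outcome density with a subsample of replicated datasets.
y_obs <- jags_data$y
y_rep <- jags_model$BUGSoutput$sims.list$y_rep
plot(density(y_obs), lwd = 2, xlab = "actual_productivity_nqt",
     main = "Posterior predictive check")
for (s in sample(nrow(y_rep), 50)) {
  lines(density(y_rep[s, ]), col = adjustcolor("grey50", alpha.f = 0.3))
}
```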

The posterior predictive check reveals a generally good agreement between the observed and predicted distributions of normalized actual productivity. The central bulk of the distribution, particularly around the mode near zero, is well captured by the model, suggesting that the main trends in the data are adequately represented.

However, we observe a few minor discrepancies:

  • The predicted distribution appears slightly smoother and more peaked near the central mode, potentially indicating some degree of over-regularization.
  • The left tail (around -2) is underestimated by the model, suggesting that some extreme low-productivity values are not fully captured.
  • The right tail is slightly underpredicted, though this deviation is relatively minor.

Overall, the model provides a satisfactory fit, with the posterior predictive distribution broadly capturing the shape and spread of the observed data. These results support the adequacy of the model structure and prior specification, though minor tail mismatches could potentially be addressed in future refinements.

3.1.4 Model Fit: Deviance Information Criterion (DIC)

We compute the DIC as a measure of model fit penalized by complexity.
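With R2jags, the DIC is computed during sampling (DIC = TRUE above) and stored on the fit object:

```r
# Report the DIC computed by JAGS during sampling.
cat("DIC value:", jags_model$BUGSoutput$DIC, "\n")
```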

## DIC value: 346.0149

3.1.5 Autocorrelation Diagnostics

The autocorrelation plots displayed below provide insight into the sampling efficiency of the MCMC algorithm for each parameter. For readability, we report only the first 30 lags, which are generally sufficient to assess short-term dependence in the chains.

Across all monitored parameters, autocorrelation declines rapidly toward zero. This pattern indicates good mixing behavior and low correlation between consecutive samples. Overall, the MCMC chains exhibit desirable autocorrelation properties, which supports the reliability of posterior summaries and the adequacy of the chosen priors and model specification.
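These plots can be produced with coda; a sketch for a few representative parameters:

```r
# Autocorrelation plots (first 30 lags) for selected parameters.
library(coda)
autocorr.plot(samples_diag[, c("beta0", "betaX[1]", "sd_team")], lag.max = 30)
```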

3.1.6 Traceplots of Parameters and Posterior Density Overlays

We inspect traceplots of the MCMC chains to ensure proper mixing and stationarity. Well-behaved chains should explore the posterior distribution thoroughly without visible drift or trends. In this case, all traceplots display stable horizontal bands with no visible trends or drifts, which is a strong indication of convergence. The chains for each parameter show rapid oscillation around a constant mean. Overall, the traceplots provide reassuring evidence that the chains have reached stationarity and are adequately sampling from the posterior distributions.

To further support our convergence assessment, we visualize the posterior distributions of each parameter across the MCMC chains. The high degree of overlap among the density curves confirms the consistency of the chains and provides additional evidence of proper convergence in this model.
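Both the traceplots and the per-chain density overlays come directly from coda's plot method for mcmc.list objects:

```r
# Traceplots (left) and posterior densities per chain (right) for key parameters.
library(coda)
plot(samples_diag[, c("beta0", "sd_team")])
```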

3.1.7 Inference and Parameter Interpretation

The table below presents summary statistics for all estimated parameters, including posterior means, standard deviations, credible intervals and diagnostic metrics such as R-hat and effective sample sizes.

Posterior Summary
variable mean median sd mad mcse_mean mcse_sd rhat ess_bulk ess_tail 2.5% 25% 50% 97.5%
beta0 -0.470 -0.470 0.020 0.021 0.001 0.000 1.007 1627.161 3717.687 -0.510 -0.484 -0.470 -0.430
beta1[1] -0.129 -0.128 0.049 0.049 0.001 0.000 1.000 7610.198 11778.713 -0.226 -0.161 -0.128 -0.034
beta1[2] -0.241 -0.239 0.070 0.070 0.001 0.001 1.001 5510.813 8032.367 -0.381 -0.287 -0.239 -0.107
beta2[1] 0.061 0.061 0.049 0.048 0.001 0.000 1.000 2316.974 4733.202 -0.036 0.029 0.061 0.155
beta2[2] -0.981 -0.979 0.116 0.115 0.001 0.001 1.001 6904.462 12757.937 -1.210 -1.058 -0.979 -0.756
betaX[1] 0.248 0.248 0.014 0.013 0.000 0.000 1.003 2933.938 6799.576 0.221 0.239 0.248 0.275
betaX[2] 0.599 0.599 0.023 0.023 0.001 0.000 1.007 1530.500 3263.046 0.553 0.583 0.599 0.643
betaX[3] -0.372 -0.366 0.157 0.155 0.001 0.001 1.000 12557.445 12169.283 -0.702 -0.473 -0.366 -0.080
sd_team 0.120 0.113 0.038 0.032 0.001 0.002 1.001 3122.536 5665.926 0.064 0.094 0.113 0.210
u[1] 0.094 0.092 0.053 0.052 0.001 0.001 1.001 2460.838 3889.824 -0.006 0.058 0.092 0.202
u[10] -0.062 -0.061 0.049 0.047 0.001 0.001 1.001 2209.300 3611.534 -0.160 -0.093 -0.061 0.034
u[11] -0.136 -0.134 0.061 0.061 0.001 0.001 1.001 1765.849 2872.654 -0.263 -0.176 -0.134 -0.023
u[12] -0.117 -0.115 0.056 0.054 0.001 0.001 1.001 1732.252 2358.106 -0.236 -0.152 -0.115 -0.014
u[2] 0.155 0.153 0.059 0.058 0.001 0.001 1.001 3075.093 4917.095 0.047 0.115 0.153 0.277
u[3] -0.052 -0.053 0.047 0.046 0.001 0.001 1.001 2101.652 3334.018 -0.144 -0.083 -0.053 0.042
u[4] 0.056 0.055 0.049 0.048 0.001 0.001 1.001 2227.494 4094.144 -0.039 0.023 0.055 0.154
u[5] 0.000 0.000 0.051 0.050 0.001 0.001 1.001 2422.792 4000.558 -0.099 -0.033 0.000 0.104
u[6] -0.057 -0.055 0.051 0.051 0.001 0.001 1.002 1695.403 2725.233 -0.163 -0.090 -0.055 0.039
u[7] 0.106 0.104 0.053 0.052 0.001 0.001 1.001 2629.754 4557.211 0.008 0.070 0.104 0.213
u[8] 0.100 0.098 0.054 0.053 0.001 0.001 1.001 2640.975 4615.473 -0.004 0.063 0.098 0.211
u[9] -0.088 -0.088 0.048 0.047 0.001 0.001 1.001 2079.430 3466.754 -0.187 -0.119 -0.088 0.005
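A table of this form can be produced with the posterior package, whose summarise_draws() function reports the same diagnostics (mean, median, sd, mad, MCSEs, R-hat and bulk/tail effective sample sizes); a sketch:

```r
# Summarise posterior draws with convergence diagnostics.
library(posterior)
draws <- as_draws_df(samples_diag)
summarise_draws(draws)
```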

All parameters exhibit excellent convergence diagnostics: R-hat values are uniformly close to 1.000 and effective sample sizes comfortably exceed 1000, ensuring reliable and stable posterior estimates.

The intercept (beta0) is significantly negative (mean = –0.47, 95% CI: [–0.51, –0.43]) and, due to the centering of the team-level effects, represents the average baseline productivity across all teams after normalization. Among the linear predictors, betaX[2] (incentive) shows a strong positive effect on productivity (mean = 0.599, CI: [0.553, 0.643]) and betaX[1] (targeted_productivity) also exhibits a positive association (mean = 0.248, CI: [0.221, 0.275]). In contrast, betaX[3] (idle_time_flag) has a significant negative effect (mean = –0.372, CI: [–0.702, –0.080]), indicating that flagged idle time is associated with reduced productivity.

Spline coefficients for over_time (beta1) and smv (beta2) capture flexible, non-linear effects. Although individual spline coefficients are not directly interpretable, their magnitude and joint significance indicate how much and where the estimated function deviates from linearity. Notably, beta2[2] (mean ≈ –0.981), beta1[1] and beta1[2] have credible intervals excluding zero, supporting the presence of non-linear relationships with productivity.

The estimated standard deviation of the team-level effects (sd_team) is around 0.12 and credibly positive, revealing moderate variability between teams. In this model, team-level intercepts u[j] have been centered, meaning each team effect is expressed as a deviation from the mean team productivity. This reparameterization improves model identifiability and allows the global intercept to absorb the average effect. Most u[j] values have intervals including zero, suggesting that many teams do not differ substantially from the overall mean. However, u[2], u[7], u[11] and u[12] have credible intervals that exclude zero, indicating that these teams exhibit statistically distinct behavior relative to the average.

The parameters with credible intervals that exclude zero and are therefore considered statistically relevant include beta0, betaX[1], betaX[2], betaX[3], beta1[1], beta1[2], beta2[2], sd_team and the team effects u[2], u[7], u[11], u[12]. These contribute most substantially to explaining variation in productivity. For the remaining parameters, credible intervals contain zero, reflecting greater posterior uncertainty about their individual impact, though they may still contribute to capturing complex patterns or improving regularization.

3.2 Model 2: Non-Centered Hierarchical Model with Spline Interaction Incentive × WIP

Guided by our exploratory data analysis, we found strong evidence that the variable wip may play an important nonlinear role in determining productivity, especially through its interaction with incentive and possible relation with smv. While incentive was previously modeled linearly and appeared statistically significant, we now test for its nonlinear effects and interactions. We therefore propose a model that includes both linear and smooth terms and a flexible interaction between incentive and wip.

The outcome variable \(y_i\) is once again modeled using a Student-t likelihood, improving robustness against outliers by allowing heavier tails. Specifically: \[ y_i \sim \text{Student-t}(\mu_i, \tau^{-1}, \nu) \] with \(\tau = \sigma^{-2}\), using the same parameterization as in Model 1.

The linear predictor \(\mu_i\) is composed as: \[ \mu_i = X_i^\top \beta_X + Z_{1,i}^\top \beta_1 + Z_{2,i}^\top \beta_2 + Z_{3,i}^\top \beta_3 + u_{\text{team}[i]} \] where:

  • \(\beta_X\) contains the intercept together with the linear coefficients for targeted_productivity and idle_time_flag;
  • \(\beta_1\), \(\beta_2\) and \(\beta_3\) are spline coefficients for over_time, smv and the nonlinear interaction between incentive and wip, respectively;
  • \(u_{\text{team}[i]}\) is the team-level random effect.

Smooth terms are constructed using orthogonalized splines to ensure identifiability and better sampling performance.

The group-level effects are modeled hierarchically:

  • \(u_j = u_{\text{raw}, j} \cdot \text{sd}_{\text{team}}\), with \(u_{\text{raw}, j} \sim \mathcal{N}(0, 1)\);
  • \(\text{sd}_{\text{team}} \sim \text{Uniform}(0, 100)\), allowing for flexibility in group-level variability.

We deliberately do not center the group-level effects in this model. As a result, the global intercept betaX[1] represents the overall baseline productivity across all teams, rather than being adjusted to reflect a mean-zero constraint on team effects. This modeling choice allows each u[j] to be interpreted as the absolute deviation of team j from the global baseline, which is especially valuable when the goal is to compare team performance directly.

The team-level effects are modeled using a non-centered parameterization, where each u[j] is expressed as the product of a standard normal variable and a shared standard deviation parameter sd_team. This implies the prior \(u_j \sim \mathcal{N}(0, \text{sd_team}^2)\) with sd_team following a weakly informative uniform prior, allowing the data to inform how heterogeneous the teams are. The non-centered approach enhances sampling efficiency and numerical stability, particularly when the variance across teams is low or only weakly informed by the data. Overall, this parameterization makes the estimation process more stable and the interpretation of team effects more transparent. In the previous model, where team effects were centered, most team-specific estimates hovered around zero with few statistically distinguishable deviations. By removing this constraint, we aim to explore whether more meaningful absolute differences emerge when each team’s deviation from the global baseline is modeled directly.

Unlike the other coefficients, the intercept receives the prior \(\beta_X[1] \sim \mathcal{N}(0, 10)\) (specified in JAGS as dnorm(0, 0.1), where the second argument is a precision). This more diffuse choice lets the intercept absorb the global baseline level, which is necessary now that the team effects are no longer centered. All other coefficients, including splines and interaction terms, receive weakly informative priors \(\mathcal{N}(0, 1)\), promoting flexibility while avoiding overfitting.

The residual standard deviation \(\sigma\) is assigned a broad prior \(\text{Uniform}(0, 100)\) and the degrees of freedom \(\nu\) of the Student-t distribution are given an exponential prior \(\nu \sim \text{Exponential}(1)\).

In summary, this model combines additive splines, interaction terms and random effects within a Student-t framework to provide a robust, interpretable and flexible modeling approach informed by both prior knowledge and data-driven structure.

3.2.1 Model Definition and Setup

We now present the full specification of the extended Bayesian model, along with the corresponding data and initialization setup.

set.seed(123)
model_string <- "
model {
  for (i in 1:N) {
    mu[i] <- inprod(betaX[], X[i,]) +
             inprod(beta1[], Z1[i,]) +
             inprod(beta2[], Z2[i,]) +
             inprod(beta3[], Z3[i,]) +
             u[team[i]]

    y[i] ~ dt(mu[i], tau, nu)
    y_rep[i] ~ dt(mu[i], tau, nu)
  }

  sigma ~ dunif(0, 100)
  tau <- pow(sigma, -2)

  nu ~ dexp(1)

  betaX[1] ~ dnorm(0, 0.1)
  for (j in 2:KX) {
    betaX[j] ~ dnorm(0, 1)
  }
  for (j in 1:K1) {
    beta1[j] ~ dnorm(0, 1)
  }
  for (j in 1:K2) {
    beta2[j] ~ dnorm(0, 1)
  }
  
  for (j in 1:K3) {
    beta3[j] ~ dnorm(0, 1)
  }
  
  sd_team ~ dunif(0, 100)
  for (j in 1:J_team) {
    u_raw[j] ~ dnorm(0, 1)
    u[j] <- u_raw[j] * sd_team
  }

  tau_team <- pow(sd_team, -2)
}
"
writeLines(model_string, con = "model2.jags")
X <- model.matrix(~ targeted_productivity + idle_time_flag, data = data)

Z_overtime <- smoothCon(s(over_time, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X
Z_smv <- smoothCon(s(smv, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X
Z_interaction <- smoothCon(te(incentive, wip, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X

jags_data <- list(
  y = as.vector(data$actual_productivity_nqt),
  X = X,
  Z1 = Z_overtime,
  Z2 = Z_smv,
  Z3 = Z_interaction,
  team = as.numeric(as.factor(data$team)),
  N = nrow(data),
  J_team = length(unique(data$team)),
  KX = ncol(X),
  K1 = ncol(Z_overtime),
  K2 = ncol(Z_smv),
  K3 = ncol(Z_interaction)
)
inits <- function() {
  list(
    betaX = rnorm(jags_data$KX, 0, 0.1),
    beta1 = rnorm(jags_data$K1, 0, 0.1),
    beta2 = rnorm(jags_data$K2, 0, 0.1),
    beta3 = rnorm(jags_data$K3, 0, 0.1),
    u_raw = rnorm(jags_data$J_team, 0, 1),
    sigma = runif(1, 0.1, 1),
    sd_team = runif(1, 0.1, 1),
    nu = runif(1, 2, 10)
  )
}

3.2.2 Parameter Selection and Model Sampling

We monitor the same key parameters as in the previous model: fixed effects, spline coefficients, team-level effects and their standard deviation. We also generate replicated responses (y_rep) for posterior predictive checks and compute the Deviance Information Criterion (DIC) to assess model fit.

params_diag <- c("betaX", "beta1", "beta2", "beta3", "u", "sd_team")
params_ppc <- c("y_rep")

jags_model <- jags(
  data = jags_data,
  inits = inits,
  parameters.to.save = c(params_diag, params_ppc, "deviance"),
  model.file = "model2.jags",
  n.chains = 3,
  n.iter = 10000,
  n.burnin = 2000,
  n.thin = 1,
  DIC = TRUE
)

samples_diag <- as.mcmc(jags_model)

As in the previous model, we save both the fitted model and MCMC diagnostics using saveRDS() for reproducibility and future analysis.

saveRDS(samples_diag, "samples_diag2_simple.rds")
saveRDS(jags_model, "jags_model2_simple.rds")

samples_diag <- readRDS("samples_diag2_simple.rds")
jags_model <- readRDS("jags_model2_simple.rds")

3.2.3 Posterior Predictive Check (PPC)

We assess model fit by comparing the observed data distribution with that of the posterior predictive simulations.
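An overlay of this kind can be drawn directly from the replicated draws (a sketch with placeholder data; with the R2jags fit above, the replicated draws are typically available as jags_model$BUGSoutput$sims.list$y_rep, a draws-by-observations matrix):

```r
set.seed(42)

# Placeholder data standing in for the observed outcome and y_rep draws
y     <- rnorm(500)
y_rep <- matrix(rnorm(200 * 500), nrow = 200)

# Observed density on top of a thinned subset of replicated densities
dens_obs <- density(y)
plot(dens_obs, lwd = 2, main = "Posterior predictive check")
for (s in sample(nrow(y_rep), 50)) {
  lines(density(y_rep[s, ]), col = rgb(0, 0, 1, alpha = 0.1))
}
```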

The PPC for Model 2 shows a substantial improvement over Model 1. The predicted distribution more closely aligns with the observed data, particularly in the central peak and the right tail. While some discrepancies remain in the left tail and around local modes, the model now captures the overall shape of the data much more faithfully.

This suggests that the introduction of the spline-based interaction between incentive and wip significantly improved the model’s ability to replicate the empirical distribution of productivity scores. The better alignment between predicted and observed densities confirms that Model 2 provides a more realistic generative process compared to the baseline specification in Model 1.

3.2.4 Model Fit: Deviance Information Criterion (DIC)

To complement the visual inspection from the PPC, we report the DIC value for Model 2 as a summary of model performance.

## DIC value: 160.9679

The previously discussed improvement is also reflected quantitatively, since the DIC dropped significantly from 346.0149 in Model 1 to just 160.9679 in Model 2. This large reduction in DIC confirms a substantially better trade-off between model fit and complexity and supports the conclusion that the inclusion of the smooth interaction between incentive and wip leads to a more accurate and parsimonious model.
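For reference, DIC can be recomputed from the monitored deviance draws using Gelman's pV = var(D)/2 approximation for the effective number of parameters (a sketch; simulated values stand in for jags_model$BUGSoutput$sims.list$deviance):

```r
set.seed(7)
deviance_draws <- rnorm(4000, mean = 140, sd = 6)  # placeholder draws

d_bar <- mean(deviance_draws)      # posterior mean deviance
p_v   <- var(deviance_draws) / 2   # effective number of parameters (pV)
dic   <- d_bar + p_v
dic
```

R2jags reports DIC automatically when DIC = TRUE, so this computation is only needed when comparing alternative penalty definitions.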

3.2.5 Autocorrelation Diagnostics

We now report the autocorrelation diagnostics for Model 2, focusing on the first 30 lags to evaluate sampling efficiency and potential issues with chain mixing.
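For a single monitored parameter, the first 30 lags can be inspected with base R's acf() applied to the chain (a sketch; an AR(1) series stands in for one column of samples_diag):

```r
set.seed(99)

# AR(1) series standing in for one parameter's MCMC chain
chain <- as.numeric(arima.sim(list(ar = 0.5), n = 8000))

ac <- acf(chain, lag.max = 30, plot = FALSE)

# Lag-1 autocorrelation sits near the AR coefficient and decays geometrically
round(ac$acf[2], 2)
```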

Overall, the autocorrelation drops off quickly for most parameters, indicating good mixing and effective sampling. The decay is particularly steep for the fixed effects and the random effect standard deviation sd_team, confirming that the non-centered parameterization continues to improve convergence properties. No persistent autocorrelation is observed, suggesting that thinning is not required and that the number of effective samples is adequate across monitored parameters.

3.2.6 Traceplots of Parameters and Posterior Density Overlays

To further assess MCMC convergence and sampling efficiency, we inspect both the traceplots and the posterior density overlays for the monitored parameters.

The traceplots reveal good mixing behavior across all chains, with no evident trends or drifts. The chains appear stationary and well-overlapped for each parameter, suggesting that the sampling process has sufficiently explored the posterior distribution. Even the group-level effects u[j] show stable and consistent traces, supporting the effectiveness of the non-centered parameterization adopted in the model.

The posterior density overlays confirm that the marginal posterior distributions from each chain align almost perfectly. The densities exhibit strong overlap and smooth, unimodal shapes, which further indicates that all chains are sampling from the same distribution.
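The visual chain-agreement check corresponds numerically to the potential scale reduction factor; a minimal between/within-chain R-hat (without the split-chain refinement) can be computed in base R (a sketch with three well-mixed simulated chains):

```r
set.seed(5)

# Three chains sampling the same distribution (well mixed by construction)
chains <- replicate(3, rnorm(2000))   # 2000 draws x 3 chains

n <- nrow(chains)
B <- n * var(colMeans(chains))        # between-chain variance
W <- mean(apply(chains, 2, var))      # within-chain variance

var_plus <- (n - 1) / n * W + B / n   # pooled variance estimate
rhat <- sqrt(var_plus / W)
round(rhat, 3)
```

Values near 1.00, as reported in the posterior summary below, indicate that the chains are indistinguishable.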

3.2.7 Inference and Parameter Interpretation

To assess the stability and uncertainty of the estimated parameters, we present a comprehensive summary of their posterior distributions alongside graphical representations of their 95% credible intervals.

Posterior Summary
variable mean median sd mad mcse_mean mcse_sd rhat ess_bulk ess_tail 2.5% 25% 50% 75% 97.5%
beta1[1] -0.093 -0.092 0.035 0.034 0.000 0.000 1.001 6595.368 9999.419 -0.164 -0.116 -0.092 -0.069 -0.026
beta1[2] -0.037 -0.037 0.049 0.048 0.001 0.000 1.000 6086.241 9267.051 -0.132 -0.070 -0.037 -0.005 0.059
beta2[1] 0.069 0.069 0.032 0.032 0.001 0.000 1.001 2847.908 5595.702 0.004 0.047 0.069 0.090 0.131
beta2[2] -0.763 -0.757 0.162 0.149 0.002 0.002 1.000 4667.709 6538.316 -1.121 -0.858 -0.757 -0.656 -0.461
beta3[1] -0.185 -0.183 0.062 0.060 0.001 0.001 1.002 2143.030 3474.968 -0.313 -0.224 -0.183 -0.143 -0.068
beta3[2] -0.459 -0.439 0.157 0.131 0.002 0.003 1.001 4879.730 5313.212 -0.835 -0.535 -0.439 -0.356 -0.213
beta3[3] -0.140 -0.141 0.047 0.047 0.001 0.000 1.000 4989.977 8729.665 -0.233 -0.172 -0.141 -0.109 -0.046
beta3[4] 0.366 0.364 0.104 0.099 0.002 0.001 1.002 2620.341 4690.300 0.168 0.298 0.364 0.431 0.580
beta3[5] -0.074 -0.075 0.056 0.056 0.001 0.000 1.001 5362.729 9130.119 -0.183 -0.112 -0.075 -0.036 0.039
beta3[6] -0.046 -0.032 0.484 0.478 0.008 0.004 1.000 4127.047 7788.270 -1.030 -0.362 -0.032 0.280 0.868
beta3[7] 1.791 1.790 0.062 0.061 0.001 0.001 1.001 3824.381 6858.734 1.671 1.749 1.790 1.832 1.914
beta3[8] 2.021 2.027 0.153 0.148 0.003 0.002 1.001 3031.710 4471.204 1.702 1.925 2.027 2.125 2.305
betaX[1] 0.022 0.022 0.019 0.017 0.001 0.000 1.002 1220.384 2196.566 -0.014 0.011 0.022 0.033 0.060
betaX[2] 0.320 0.320 0.011 0.011 0.000 0.000 1.001 3413.679 5563.827 0.298 0.313 0.320 0.328 0.341
betaX[3] -0.258 -0.224 0.184 0.136 0.002 0.002 1.000 7918.712 8162.251 -0.754 -0.326 -0.224 -0.140 0.011
sd_team 0.054 0.051 0.019 0.017 0.000 0.000 1.002 1619.544 2350.143 0.025 0.041 0.051 0.065 0.099
u[1] 0.026 0.025 0.034 0.033 0.001 0.000 1.001 3483.075 6496.685 -0.038 0.004 0.025 0.048 0.096
u[10] -0.050 -0.049 0.028 0.027 0.001 0.000 1.001 2871.370 4984.004 -0.108 -0.067 -0.049 -0.030 0.003
u[11] -0.038 -0.036 0.034 0.033 0.001 0.000 1.001 2629.598 4520.949 -0.108 -0.059 -0.036 -0.014 0.024
u[12] -0.018 -0.017 0.031 0.030 0.001 0.000 1.001 2310.847 3525.315 -0.085 -0.038 -0.017 0.003 0.039
u[2] 0.025 0.024 0.029 0.028 0.001 0.000 1.001 2653.529 5007.223 -0.030 0.005 0.024 0.044 0.086
u[3] -0.028 -0.027 0.027 0.027 0.001 0.000 1.001 2501.063 4806.296 -0.084 -0.046 -0.027 -0.010 0.024
u[4] 0.066 0.064 0.034 0.034 0.001 0.000 1.000 3238.721 5695.006 0.004 0.042 0.064 0.088 0.137
u[5] -0.010 -0.011 0.030 0.028 0.001 0.000 1.001 3399.344 5705.707 -0.069 -0.029 -0.011 0.009 0.049
u[6] -0.010 -0.009 0.028 0.026 0.001 0.000 1.001 2374.906 3774.745 -0.067 -0.027 -0.009 0.008 0.041
u[7] 0.054 0.053 0.034 0.033 0.001 0.000 1.000 3124.081 5986.409 -0.006 0.031 0.053 0.076 0.127
u[8] 0.040 0.038 0.032 0.031 0.001 0.000 1.000 3023.314 5615.285 -0.019 0.018 0.038 0.060 0.109
u[9] -0.056 -0.054 0.027 0.027 0.001 0.000 1.001 2486.602 4443.467 -0.114 -0.073 -0.054 -0.037 -0.006

All parameters display excellent convergence properties: R-hat values are tightly concentrated around 1.000, and effective sample sizes (both bulk and tail) are consistently high, often exceeding 3,000, indicating well-mixed chains and reliable posterior estimates.

The intercept (betaX[1]) is small and centered around zero (mean = 0.022, 95% CI: [–0.014, 0.060]), as expected in a model where the random team-level effects are not constrained to sum to zero. This reflects the fact that the global baseline is absorbed through the intercept without centering.

Among the linear predictors, betaX[2], which encodes targeted_productivity, has a strong and precisely estimated positive effect (mean = 0.320, 95% CI: [0.298, 0.341]). In contrast, betaX[3] (idle_time_flag) displays a broad credible interval containing zero (mean = –0.258, CI: [–0.754, 0.011]), indicating weak evidence for its effect.

The spline coefficients for over_time (beta1), smv (beta2) and the interaction between wip and incentive (beta3) capture complex non-linear relationships. As with all spline terms, the individual coefficients are not directly interpretable in isolation, but their patterns and magnitudes offer insight. Several terms have posterior means far from zero with narrow credible intervals. Notably, beta3[7] and beta3[8] show strong positive effects (means ≈ 1.79 and 2.02, respectively), indicating a strong non-linear contribution of the interaction term to productivity. beta3[4] and beta2[1] exhibit moderate positive effects, with posterior means of approximately 0.366 and 0.069, respectively. Likewise, beta2[2] has a large negative mean (–0.763, CI: [–1.121, –0.461]), revealing an important dip in the functional relationship between smv and productivity; this suggests that while smv may have a slight positive effect in certain regions, its overall influence is predominantly negative. Moreover, beta1[1] shows a slight negative effect (posterior mean = –0.093), suggesting that over_time may have a minor negative influence on the outcome. Finally, beta3[1], beta3[2] and beta3[3] show negative effects, with posterior means of –0.185, –0.459 and –0.140, respectively. This suggests that the interaction surface is heterogeneous: positive in some regions, negative in others and indistinguishable from zero elsewhere.

Interestingly, the spline coefficient beta3[6] has a very wide credible interval (mean = –0.046, 95% CI: [–1.030, 0.868]), suggesting this component may be unidentifiable or weakly supported by the data, possibly due to multicollinearity or limited variation in the spline basis at that location.

The estimated standard deviation of the team effects (sd_team) is small but clearly positive (mean = 0.054, 95% CI: [0.025, 0.099]), confirming the presence of non-negligible heterogeneity among teams.

The team-specific intercepts u[j] are not centered and should be interpreted as absolute deviations from the global baseline. A few team effects have credible intervals that exclude zero, indicating meaningful deviations in productivity: u[4] (mean = 0.066, CI: [0.004, 0.137]) suggests above-average productivity, while u[9] (mean = –0.056, CI: [–0.114, –0.006]) reflects clear underperformance. u[7] (mean = 0.054, CI: [–0.006, 0.127]) also points toward above-average productivity, although its interval narrowly includes zero. These differences persist even after controlling for all included covariates.

Parameters with credible intervals that exclude zero and are therefore deemed statistically relevant include:
- Linear and spline terms betaX[2], beta1[1], beta2[1], beta2[2], beta3[1], beta3[2], beta3[3], beta3[4], beta3[7], beta3[8]
- Group-level standard deviation sd_team
- Team effects u[4] and u[9]

These effects contribute most substantially to explaining productivity differences. The other parameters have higher posterior uncertainty and should be interpreted with caution, although they may still play important roles in shaping smooth response surfaces or improving model flexibility and predictive performance.
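The screening rule applied above (a 95% credible interval excluding zero) can be checked programmatically on a draws matrix (a sketch with simulated draws; with the coda output above, the matrix would come from as.matrix(samples_diag)):

```r
set.seed(3)

# Simulated draws matrix: rows are posterior draws, columns are parameters
draws <- cbind(strong = rnorm(5000, mean = 0.32, sd = 0.01),
               weak   = rnorm(5000, mean = 0.02, sd = 0.10))

# 2.5% and 97.5% quantiles per parameter
ci <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

# A parameter is flagged when both CI endpoints share the same sign
excludes_zero <- ci[1, ] * ci[2, ] > 0
excludes_zero
```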

3.3 Model 3: Non-Centered Hierarchical Model with Non-Linear Main Effect Incentive and Spline Interaction WIP × SMV

This third model builds on the structure and insights gained from Models 1 and 2 and aims to further refine the representation of nonlinear relationships by modifying the specification of the smooth terms.

As in the previous models, we use a Student-t distribution for the observed outcomes, which is robust to outliers and heavy tails: \[ y_i \sim \text{Student-t}(\mu_i, \tau, \nu), \qquad \tau = \sigma^{-2} \] where \(\tau\) is the precision. The linear predictor for each observation \(i\) is defined as: \[ \mu_i = \mathbf{X}_i^\top \boldsymbol{\beta}_X + \mathbf{Z}_{1,i}^\top \boldsymbol{\beta}_1 + \mathbf{Z}_{2,i}^\top \boldsymbol{\beta}_2 + \mathbf{Z}_{3,i}^\top \boldsymbol{\beta}_3 + u_{j[i]} \]

Here, the linear terms X include targeted_productivity and idle_time_flag, while the smooth components consist of a univariate spline Z1 for over_time, a bivariate tensor-product spline Z2 for the interaction between smv and wip, and a univariate spline Z3 for incentive. This choice is motivated by our exploratory data analysis, which revealed that the interaction between smv and wip may exhibit complex nonlinear effects on productivity. While Model 2 included a spline-based interaction between incentive and wip, which led to a substantial improvement in both fit and DIC, we observed that incentive alone also acts as a strong predictor. We nonetheless retain a smooth (nonlinear) effect for incentive in the current specification for two main reasons. First, the results from Model 2 indicate that incentive can behave nonlinearly when interacting with other variables, suggesting that its marginal effect may not be fully captured by a purely linear term. Second, a spline allows the model to adapt flexibly to plateaus, thresholds or saturation effects that a linear term would fail to represent. Importantly, including incentive as a univariate smooth term lets us disentangle the monetary effect from physical production parameters such as smv and wip, while retaining enough flexibility to capture its nonlinearity. This design balances interpretability with model performance and avoids the overfitting risk associated with more complex interactions.

The prior for the intercept is betaX[1] ~ dnorm(0, 0.01); since JAGS parameterizes the normal by precision, this corresponds to \(\mathcal{N}(0, 100)\), i.e. a variance of 100. This is a relatively weak prior that favors stability while allowing the intercept to absorb the global baseline across all teams. It is particularly important in our non-centered framework, where the random effects are not constrained to sum to zero; as a result, the intercept must capture the average productivity level across all observations. The remaining regression coefficients and spline weights betaX[2], beta1, beta2, beta3 are assigned standard normal priors \(\mathcal{N}(0,1)\), which provide mild regularization without being overly restrictive. We use broad \(\text{Uniform}(0, 100)\) priors for the residual scale parameter sigma and the team-level standard deviation sd_team, reflecting minimal prior information. The degrees of freedom nu of the t-distribution follow an \(\text{Exponential}(1)\) prior, which places more mass on small values and therefore allows for heavier tails, promoting robustness to outliers.

Team effects are modeled using a non-centered parameterization: \[ u_j = u_{\text{raw},j} \cdot \text{sd}_{\text{team}}, \quad \text{where } u_{\text{raw},j} \sim \mathcal{N}(0, 1), \] which we previously found to improve sampling efficiency and interpretability.

3.3.1 Model Definition and Setup

We now define the JAGS model, the list of monitored parameters and the function used to initialize the chains. The initialization is designed to promote convergence and stability during sampling, while remaining weakly informative.

In particular, we initialize the intercept betaX[1] at a moderately positive value (0.25), which reflects the average productivity level observed in the data after standardization and provides a sensible starting point in our non-centered framework. The remaining regression coefficients and spline weights are initialized from a narrow normal distribution centered at zero, allowing for slight variation while avoiding extreme starting values. The team-level random effects u_raw are sampled from a standard normal distribution, consistent with their prior. The scale parameters sigma and sd_team are initialized from a uniform distribution on (0.1, 1), ensuring positivity and moderate variability. Finally, the degrees of freedom nu of the t-distribution are initialized uniformly between 2 and 10, which avoids extreme tail behavior at the beginning of sampling.

set.seed(123)
model_string <- "
model {
  for (i in 1:N) {
    mu[i] <- inprod(betaX[], X[i,]) +
             inprod(beta1[], Z1[i,]) +
             inprod(beta2[], Z2[i,]) +
             inprod(beta3[], Z3[i,]) +
             u[team[i]]

    y[i] ~ dt(mu[i], tau, nu)
    y_rep[i] ~ dt(mu[i], tau, nu)
  }

  sigma ~ dunif(0, 100)
  tau <- pow(sigma, -2)
  
  nu ~ dexp(1)

  betaX[1] ~ dnorm(0, 0.01)
  for (j in 2:KX) {
    betaX[j] ~ dnorm(0, 1)
  }
  for (j in 1:K1) {
    beta1[j] ~ dnorm(0, 1)
  }
  for (j in 1:K2) {
    beta2[j] ~ dnorm(0, 1)
  }
  for (j in 1:K3) {
    beta3[j] ~ dnorm(0, 1)
  }
  sd_team ~ dunif(0, 100)
  for (j in 1:J_team) {
    u_raw[j] ~ dnorm(0, 1)
    u[j] <- u_raw[j] * sd_team
  }

  tau_team <- pow(sd_team, -2)
}
"
writeLines(model_string, con = "model3.jags")
X <- model.matrix(~ targeted_productivity + idle_time_flag, data = data)

Z_overtime <- smoothCon(s(over_time, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X
Z_incentive <- smoothCon(s(incentive, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X
Z_interaction <- smoothCon(te(smv, wip, bs = "cr", k = 3), data = data, absorb.cons = TRUE)[[1]]$X

jags_data <- list(
  y = as.vector(data$actual_productivity_nqt),
  X = X,
  Z1 = Z_overtime,
  Z2 = Z_interaction,
  Z3 = Z_incentive,
  team = as.numeric(as.factor(data$team)),
  N = nrow(data),
  J_team = length(unique(data$team)),
  KX = ncol(X),
  K1 = ncol(Z_overtime),
  K2 = ncol(Z_interaction),
  K3 = ncol(Z_incentive)
)
inits <- function() {
  list(
    betaX = c(0.25, rnorm(jags_data$KX - 1, 0, 0.1)),
    beta1 = rnorm(jags_data$K1, 0, 0.1),
    beta2 = rnorm(jags_data$K2, 0, 0.1),
    beta3 = rnorm(jags_data$K3, 0, 0.1),
    u_raw = rnorm(jags_data$J_team, 0, 1),
    sigma = runif(1, 0.1, 1),
    sd_team = runif(1, 0.1, 1),
    nu = runif(1, 2, 10)
  )
}

3.3.2 Parameter Selection and Model Sampling

We monitor the same categories of parameters as before: regression terms, spline coefficients and group-level effects. We also include replicated outcomes (y_rep) for posterior predictive checks and the DIC for model comparison.

params_diag <- c("betaX", "beta1", "beta2", "beta3", "u", "sd_team")
params_ppc <- c("y_rep")

jags_model <- jags(
  data = jags_data,
  inits = inits,
  parameters.to.save = c(params_diag, params_ppc, "deviance"),
  model.file = "model3.jags",
  n.chains = 3,
  n.iter = 10000,
  n.burnin = 2000,
  n.thin = 1,
  DIC = TRUE
)

As before, we save both the model object and the diagnostics with saveRDS() for future access.

samples_diag <- as.mcmc(jags_model)

saveRDS(samples_diag, "samples_diag3_simple.rds")
saveRDS(jags_model, "jags_model3_simple.rds")

samples_diag <- readRDS("samples_diag3_simple.rds")
jags_model <- readRDS("jags_model3_simple.rds")

3.3.3 Posterior Predictive Check (PPC)

We now perform a posterior predictive check to visually compare the observed and predicted densities of the outcome variable.

The plot below shows a close alignment between the predicted and observed densities, particularly around the main mode of the distribution. Compared to the earlier models, this fit is visibly improved: the underestimation in the left tail observed in Model 1 has been largely corrected and the excessive smoothing of the central peak seen in Model 2 has been mitigated.

The predicted distribution still slightly overshoots the observed one in the region just below the mode, but the fit is clearly more faithful overall. The improved match across the entire range, including in the tails, confirms that this model captures both the central tendency and dispersion more effectively than previous specifications.

3.3.4 Model Fit: Deviance Information Criterion (DIC)

We also evaluate model fit using the DIC:

## DIC value: 150.8271

This model achieves a substantially improved DIC value of 150.8, representing a remarkable gain in fit compared to Model 1 and a notable improvement over Model 2. The reduction in DIC indicates that the current specification captures the underlying structure of the data more effectively while maintaining parsimony, confirming the benefit of the revised smooth terms and interaction structure.

3.3.5 Autocorrelation Diagnostics

We now examine the autocorrelation of the sampled parameters to assess the efficiency of the MCMC chains.

The autocorrelation decays rapidly for all monitored parameters, suggesting efficient sampling. Compared to previous models, the overall structure indicates similar or slightly better mixing, particularly for the spline coefficients and team-level effects.

3.3.6 Traceplots of Parameters and Posterior Density Overlays

We assess the quality of MCMC convergence by examining both trace plots and posterior density overlays for all monitored parameters.